(54) Method and apparatus for facilitating routing protocol redundancy in a network element



(57) An embodiment of a method (100) and apparatus
(400) for synchronizing routing protocol information
associated with a plurality of routing modules (402, 404)
of a network element (400) is disclosed herein. The
method includes an operation for adding an additional
routing module (404) to a network element (400). The
network element includes an existing routing module
(402) having an existing collection of routing protocol
information (414) associated therewith. In response to
adding the additional routing module to the network
element, an operation is performed for imparting the
existing collection of routing protocol information upon
the additional routing module. After updating the existing
routing module (402) with new routing protocol
information, an operation is performed for updating the
additional routing module (404) with this new routing
protocol information.



FIG. 4

[Block diagram: network element 400 comprising an active routing module 402 (routing information database 414, BGP task 410, TCP task 406, receive queue, transmit queue 420), an inactive routing module 404 (routing information database 416, BGP task 412, TCP task 408, transmit queue 424, receive queue 422 having a pending portion 426 and a ready portion 428), and a line card 405.]

Printed by Jouve. 75001 PARIS (FR) 



EP 1 331 772 A1



Description 

FIELD OF THE DISCLOSURE 

[0001] The present invention relates generally to network
communications, and more particularly to synchronization
of redundant communication tasks.

BACKGROUND 

[0002] Data communication protocols serve to facili- 
tate transmission and reception of data across commu- 
nication networks. For example, transmission control 
protocol (TCP), Internet protocol (IP), border gateway
protocol (BGP), asynchronous transfer mode (ATM), 
and various other protocols facilitate communication of 
data between two or more locations in a communication 
network. Through the use of such protocols, communi- 
cation of data across a plurality of communication net- 
works may be facilitated, even though two or more of 
the networks comprise different operating systems and 
architectures. 

[0003] The Open Systems Interconnect (OSI) Refer- 
ence Model developed by the International Standards 
Organization (ISO) is generally used to describe the 
structure and function of data communications. The OSI 
Reference Model encompasses seven layers, often re- 
ferred to as a stack or protocol stack, which define the 
functions of data communications protocols. The proto- 
col stack comprises a physical layer, a data link layer, a 
network layer, a transport layer, a session layer, a pres- 
entation layer, and an application layer. A layer does not 
define a single protocol, but rather a data communica- 
tions function that may be performed by any number of 
protocols suitable to the function of that layer. For
example, a file transfer protocol and an electronic mail
protocol provide user services, and are thus part of the
application layer. Every protocol communicates with its
peer, which is a standardized implementation of the
identical protocol in the equivalent layer on a remote
system. For example, a local electronic mail protocol is
the peer of a remote electronic mail protocol. As another 
example, BGP on a local router exchanges routing in- 
formation with BGP on a neighboring router. 
[0004] Applications, such as BGP, which require a 
transport protocol to provide reliable data delivery, often 
use TCP because TCP verifies that data is delivered 
across a network (between separate end systems) accurately
and in the proper sequence. TCP provides reliability
with a mechanism referred to as Positive
Acknowledgement with Retransmission (PAR). In simplest
terms, a system with PAR re-transmits the data for which 
it has not received an acknowledgement message from 
a far-end node. Information is communicated between 
cooperating TCP modules in segments. A segment is a 
datagram containing a TCP header and perhaps data. 
The TCP header contains sequence numbers. Control 
information, called a handshake, is exchanged between
the two endpoints to establish a dialogue before data is
transmitted.

[0005] As previously discussed, border gateway pro- 
tocol (BGP) typically runs over TCP (e.g., port 179). 
BGP version 4 (BGP4) is the current de facto exterior
routing protocol for inter-domain (autonomous systems) 
routing. BGP is a protocol used to advertise routes be- 
tween networks of routers, e.g., between a Service Pro- 
vider's network and a Carrier's network. Routers at the 
edges of these networks exchange BGP messages, 
which could affect hundreds of thousands of routes. If 
the BGP process at one of these edge routers termi- 
nates (e.g., because of a restart, hardware failure, soft- 
ware upgrade, etc.), service on the routes between the 
networks is usually affected. The termination also caus- 
es additional BGP messages to be exchanged between 
other edge routers to update information about available 
routes. Consequently, the termination results in a period 
of route instability and unavailability of the affected rout- 
er, which consequences are desirable to avoid. Further- 
more, the termination will often result in a flood of re- 
routing messages being sent into the network, thus ad- 
versely affecting performance of the network. 
[0006] A conventional BGP redundancy technique for 
addressing BGP process failures involves configuring 
two or more routers from different vendors in parallel.
The objective of such a technique is to reduce the po- 
tential for BGP process failures by relying on the as- 
sumption that one of the routers will survive at least 
some of the time a particular set of circumstances that 
might lead to failure of another router. For example, at 
least one of the routers would ideally exhibit immunity 
to failures such as those that might be caused by an
offending message, a hardware fault, or a software fault.
That is, it is assumed that routers from different vendors 
are susceptible to different types of failures. This type 
of conventional BGP redundancy technique is generally 
expensive due to the inherent cost of the multiple routers 
and because using equipment from multiple vendors 
causes additional operation, support, network manage- 
ment, and training costs. Additionally, this type of con- 
ventional BGP redundancy technique requires addition- 
al BGP messages to be exchanged to move the routes 
onto the tandem router, thus increasing cost, complex- 
ity, and network traffic. The attached routers still notice 
that the first router has disappeared and then route 
around it. Accordingly, it is desirable to avoid the disad- 
vantages associated with such a conventional BGP re- 
dundancy technique. 

[0007] A graceful restart mechanism for a router is another
conventional technique for addressing BGP process
failures. Such a graceful restart mechanism is proposed
in an Internet Engineering Task Force (IETF) draft
entitled "Graceful Restart Mechanism for BGP". In this 
proposal, a router has the capability of preserving its for- 
warding state (routes) over a BGP restart, the ability to 
notify its peer routers of this capability and the ability to
notify its peer routers of an estimated time for restart 
completion before it initiates such a restart. Upon
detecting that the BGP process of the router has terminated
(i.e., a failed router) and in response to receiving a
corresponding notification, the peer routers do not send 
new best routes to accommodate for the failed router 
unless it fails to restart within the specified time limit. 
[0008] Such a graceful restart mechanism requires
that the peer routers be able to interpret and respond to 
the restart notification. Additionally, while the failed rout- 
er is restarting it cannot process routing updates that 
would normally be received. Consequently, it becomes 
out of date during the period of unavailability, which is 
followed by a burst of updates once back in service.
These updates cause increased "churn" in the routing
tables of other routers, which affects performance of the 
network and should therefore be avoided. Even worse, 
routing loops or "blackholes" may form in this period of
unavailability. Such "blackholes" occur when a route is 
advertised as available, but when the corresponding 
router is not actually configured to support such a route, 
resulting in loss of packets intended to be communicated
over that route. Furthermore, the router may not actually
be coming back into service. Also, since a graceful
restart mechanism allows the specified time limit for 
routers to be restarted, waiting that amount of time can 
increase the time it takes to detect a failure and route 
around the failed router. Additionally, implementation of
such a graceful restart mechanism requires protocol
extensions to BGP to which all routers aware of the failure
must adhere in order to support the graceful restart 
mechanism. Accordingly, it is desirable to avoid the dis- 
advantages associated with a graceful restart mecha- 
nism. 

[0009] Therefore, facilitating synchronization of protocol
tasks and related information on redundant routing
modules of a network element in a manner that enables 
limitations associated with conventional redundancy 
techniques to be overcome is useful. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0010] 

FIG. 1 is a flow chart view depicting a method for
synchronizing TCP tasks running on redundant routing
modules of a network element in accordance with 
an embodiment of the disclosures made herein. 
FIG. 2 is a flow chart view depicting a method for 
facilitating an activity switch in accordance with an 
embodiment of the disclosures made herein. 
FIG. 3 is a flow chart view depicting a method for 
synchronizing routing protocol information associ- 
ated with a plurality of routing modules of a network 
element in accordance with an embodiment of the 
disclosures herein. 

FIG. 4 is a block diagram view depicting a network 
element 400 capable of carrying out methods in
accordance with embodiments of the disclosures
made herein.

DETAILED DESCRIPTION OF THE FIGURES 

[0011] An embodiment of a method and apparatus for
synchronizing routing protocol information associated 
with a plurality of routing modules of a network element 
is disclosed herein. The method includes an operation 
for adding an additional routing module to a network el- 
ement. The network element includes an existing rout- 
ing module having an existing collection of routing pro- 
tocol information associated therewith. In response to 
adding the additional routing module to the network
element, an operation is performed for imparting the
existing collection of routing protocol information upon the
additional routing module. After updating the existing
routing module with new routing protocol information,
an operation is performed for updating the additional
routing module with such new routing protocol
information.

[0012] The disclosures made herein pertain to various
aspects of facilitating synchronization of redundant rout- 
ing modules in a network element. In accordance with 
embodiments of the disclosures made herein, lower lay- 
er protocol (e.g., Transmission Control Protocol (TCP)) 
and higher layer protocol (e.g., Border Gateway Proto- 
col (BGP)) tasks of a first routing module are synchro- 
nized with respective lower layer protocol (e.g., TCP) 
and higher layer protocol (e.g., BGP) tasks of a second 
routing module. The first routing module and the second 
routing module are redundant routing modules within a 
network element. Protocol information (e.g., TCP pack- 
ets, BGP packets, etc) that is processed on the first rout- 
ing module (e.g., an active one of a plurality of redundant 
routing modules) is similarly processed on the second 
routing module (e.g., an inactive one of the plurality of 
redundant routing modules). Accordingly, such a net- 
work element in accordance with an embodiment of the 
disclosures made herein advantageously comprises re- 
dundant, synchronized routing modules that are capa- 
ble of supporting carrier-grade quality of service over 
networks of various communication protocols (e.g., In- 
ternet Protocol, etc). A lower layer protocol (e.g., TCP) 
packet (which may be referred to as a segment) is not 
necessarily congruent with a higher layer protocol (e.g., 
BGP) packet. For example, it is not necessarily true that 
a TCP packet contains a BGP packet. For example, say
a node is transmitting two BGP packets A and B and
each packet includes 1000 bytes. A TCP task will most
likely transmit portions of these BGP packets in separate
TCP packets. For example, a first TCP segment
may contain 512 bytes of data (the first 512 bytes of BGP
packet A), a second TCP segment may contain 512
bytes of data (the remaining 488 bytes of BGP packet
A, together with the first 24 bytes of BGP packet B), a
third TCP segment may contain 512 bytes of data (the
next 512 bytes of BGP packet B), and finally a fourth
TCP segment may contain 464 bytes of data (the
remaining 464 bytes of BGP packet B). The foregoing is
merely an example, and other relationships between 
lower layer protocol packets and higher layer protocol 
packets are entirely possible. 
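
The segmentation arithmetic in the example above can be checked with a minimal sketch. The function name segment_stream and the 512-byte segment size are illustrative assumptions taken from the example only; a real TCP task chooses segment boundaries by its own rules.

```python
from typing import List


def segment_stream(messages: List[bytes], max_segment_size: int) -> List[bytes]:
    """Join higher layer messages into one byte stream and cut the stream
    into segments of at most max_segment_size bytes, ignoring message
    boundaries (hypothetical helper, for illustration only)."""
    stream = b"".join(messages)
    return [stream[i:i + max_segment_size]
            for i in range(0, len(stream), max_segment_size)]


packet_a = b"A" * 1000  # stand-in for BGP packet A
packet_b = b"B" * 1000  # stand-in for BGP packet B

segments = segment_stream([packet_a, packet_b], 512)
sizes = [len(s) for s in segments]
# The second segment carries the tail of packet A and the head of packet B.
tail_of_a = segments[1].count(b"A")
head_of_b = segments[1].count(b"B")
```

Here sizes evaluates to 512, 512, 512 and 464 bytes, and the second segment holds 488 bytes of packet A followed by 24 bytes of packet B, matching the figures given in the text.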
[0013] Embodiments of the disclosures made herein 
are capable of enabling redundant lower layer protocol 
tasks (e.g., TCP tasks) and higher layer protocol tasks 
(e.g., BGP tasks), thus allowing for an activity switch 
without adversely affecting service. In such embodiments,
when an activity switch is implemented, disruption
of service on routes distributed by such higher layer
and lower layer protocols is limited, if not eliminated. For
example, after such an activity switch, a newly active
routing module (i.e., previously the inactive routing module)
processes routing updates that would normally be
received by a newly inactive routing module (i.e., previously
the active routing module). Furthermore, the newly
active routing module does not become out of date
with respect to routing information maintained on other
network elements. In this manner, the network is not
burdened by a burst of updates in response to the activity
switch. Limiting the burden of such a burst of updates
eliminates "churn" in the routing tables of network
elements, thus improving performance of the network.
Significantly, service of existing routes is maintained, and
change to existing routes, deletion of routes, and addition
of routes can continue uninterrupted; the switchover
is transparent to neighboring routers. By being transparent
to neighboring routers, a technique disclosed herein
need not require cooperation of neighboring routers to 
enable an activity switch. Accordingly, neighboring rout- 
ers need not be made aware of such an activity switch, 
nor do they need to support protocol extensions to en- 
able such an activity switch. 

[0014] Such embodiments are advantageous in that 
an offending packet of information that results in failure 
of a higher layer protocol task of a first routing module 
does not readily result in failure of the same higher layer 
protocol task of a second routing module that is redundant
and synchronized with respect to the first routing
module. One embodiment of a technique for limiting the
potential for failure of the second routing module from
the offending packet of information is to maintain a higher
layer protocol task (e.g., a BGP task) of an inactive
one of a plurality of synchronized routing modules (i.e.,
an inactive routing module) at least one higher layer
protocol packet (e.g., a BGP packet) behind the same higher
layer protocol task of an active one of the plurality of
synchronized routing modules (i.e., the active routing
module). In this manner, the offending packet is recognized
as such prior to being processed by the higher
layer protocol task of the inactive routing module, thereby
avoiding the failure of the higher layer protocol task
of the inactive routing module that would otherwise
result.

[0015] Another advantageous aspect of embodi- 
ments of the disclosures made herein is that processing 
power of a network element is not adversely affected.
Specifically, synchronization and redundancy in accordance
with embodiments of the disclosures made herein
are facilitated in an efficient and effective manner.
Accordingly, a significant majority of processing power of
the network element is available for performing primary
tasks of the network element (e.g., switching, routing,
etc).

[0016] Still another advantageous aspect of embodiments
of the disclosures made herein is that such embodiments
are less costly to implement and maintain
than conventional solutions. Such embodiments do not 
require redundant network elements, but rather redun- 
dant routing modules within a particular network ele- 
ment. In some embodiments, the redundant routing 
modules are implemented identically, thus reducing
cost. For example, similar software may be executed 
within each of the redundant routing modules. In other 
embodiments, differently-implemented redundant rout- 
ing modules may be used. 

[0017] It should be understood that embodiments of 
the present invention may be practiced with a variety of 
higher layer protocols. While BGP packets are men- 
tioned in many places herein, it should be understood 
that routes can also arrive from other protocols (e.g., 
Open Shortest Path First (OSPF)), or due to configura- 
tion changes (e.g., static routes). Not only are routes 
kept in sync between an active routing module and an
inactive routing module, but so is configuration. The
configuration can also change on-the-fly (e.g., a BGP
peer may be added or removed at any time). It should
be noted that a higher layer protocol may be used for 
advertising routes, but may also be used for withdrawing 
routes (e.g., a BGP packet may also specify a route to 
withdraw). 

[0018] Turning now to the figures, a method 100 for 
synchronizing TCP tasks running on redundant routing 
modules of a network element in accordance with an 
embodiment of the disclosures made herein is depicted 
in FIG. 1. The method is performed by a network element
preferably comprising a line card 134, an active
routing module 136, and an inactive routing module 138.
Various steps of the method are illustrated as being
performed by the line card 134, by the active routing module
136, and by the inactive routing module 138. The method
100 begins at an operation 102 where a line card of
a network element forwards a Protocol Data Unit (PDU),
or a copy thereof, for reception by an active routing module
and an inactive routing module of the network element.
An Internet Protocol routing module is an example
of both the active and the inactive routing modules. An
apparatus capable of providing routing functionality
(i.e., a router) is an example of the network element. In
other embodiments, the network element need not be
implemented on a router, but may be implemented on
one or more other network devices. As an example, for
embodiments wherein the higher layer protocol packets
are Multi-Protocol Label Switching (MPLS) packets, the
network element may be so implemented. Accordingly,
the active routing module and inactive routing module
may be considered more generically to be simply an
active module and an inactive module. The active routing
module performs an operation 104 for receiving the
PDU while the inactive routing module effectively
ignores (e.g., receives but does not process) the PDU in
operation 140.

[0019] After receiving the PDU, the active routing 
module performs an operation 106 for extracting a TCP
packet encapsulated within the PDU. The TCP packet 
extracted from the PDU is hereinafter referred to as the 
inbound TCP packet. The active routing module TCP 
task performs an operation 110 for receiving the first 
copy of the inbound TCP packet.
[0020] After the active routing module receives the
first copy of the inbound TCP packet, the active routing
module TCP task performs an operation 114 for storing
the first copy of the inbound TCP packet in a receive
queue associated with the active routing module TCP
task. After operation 114, the active routing module
performs operation 142 to make a determination as to
whether or not the inbound TCP packet should be
forwarded to the inactive routing module 138. If it is
determined that the inbound TCP packet should not be
forwarded, the process continues to operation 144, where
the inbound TCP packet is not forwarded. If it is
determined that the inbound TCP packet should be
forwarded, the process continues to operations 108 and 120. In
operation 108, the active routing module forwards a first
copy of the inbound TCP packet for reception by a TCP
task of the active routing module (i.e., the active routing
module TCP task) and a second copy of the inbound
TCP packet for reception by a TCP task of the inactive
routing module (i.e., the inactive routing module TCP
task). In at least one embodiment, operation 114 is
performed before operation 108, while, in at least one
other embodiment, operation 108 is performed before
operation 114. It is important to note that the active routing
module processes the incoming TCP packet and then, if
appropriate, forwards the incoming TCP packet (along with
other information) to the inactive routing module. Some
incoming TCP packets, for example, acknowledgements
that contain no data, need not be forwarded to
the inactive routing module. The inactive routing module
TCP task performs an operation 112 for receiving the
second copy of the inbound TCP packet. Similarly, after
the inactive routing module receives the second copy of
the inbound TCP packet, the inactive routing module
TCP task performs an operation 116 for storing the second
copy of the inbound TCP packet in a receive queue
associated with the inactive routing module TCP task.
The operation 116 for storing the second copy of the
inbound TCP packet in a receive queue associated with
the inactive routing module TCP task includes initially
storing the second copy of the inbound TCP packet in a
pending portion of the receive queue associated with the
inactive routing module TCP task (i.e., the pending
portion of the inactive routing module receive queue).
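
The forwarding decision of operations 142 and 144 can be pictured with a minimal sketch. The class and function names are hypothetical, and the criterion used here (forward only segments that carry data) is an assumption based on the single example the text gives, namely acknowledgements that contain no data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TcpSegment:
    seq: int
    payload: bytes = b""


def should_forward(segment: TcpSegment) -> bool:
    # Assumed criterion for operation 142: only data-bearing segments
    # are mirrored to the inactive routing module.
    return len(segment.payload) > 0


@dataclass
class InactiveModule:
    # Pending portion of the inactive routing module receive queue.
    pending: List[TcpSegment] = field(default_factory=list)


@dataclass
class ActiveModule:
    standby: InactiveModule
    receive_queue: List[TcpSegment] = field(default_factory=list)

    def on_inbound(self, segment: TcpSegment) -> bool:
        self.receive_queue.append(segment)        # operation 114
        if not should_forward(segment):           # operation 142
            return False                          # operation 144
        # Operations 108, 112 and 116: a second copy reaches the
        # pending portion of the inactive module's receive queue.
        self.standby.pending.append(TcpSegment(segment.seq, segment.payload))
        return True


inactive = InactiveModule()
active = ActiveModule(standby=inactive)
forwarded_ack = active.on_inbound(TcpSegment(seq=1))
forwarded_data = active.on_inbound(TcpSegment(seq=1, payload=b"update"))
```

In this sketch the data-less acknowledgement stays on the active module only, while the data-bearing segment is also placed in the pending portion of the inactive module's queue.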



[0021] In operation 120, a BGP task of the active routing
module (i.e., the active routing module BGP task)
facilitates recordation of the peer network element from
which the inbound TCP packet was received. As discussed
below in greater detail in reference to FIG. 2,
such a record of the peer network element from which
the inbound TCP packet was received is used for facilitating
an activity switch from the active routing module
to the inactive routing module if a failure occurs while
processing the BGP packet. After operation 120, the active
routing module BGP task performs an operation 118
for processing the BGP message.
[0022] After operation 118, an operation 121 is performed
for determining whether processing of the BGP
message contained in the first copy of the inbound TCP
packet is performed successfully. When the operation
118 for processing a BGP message contained in the first
copy of the inbound TCP packet is successfully performed,
the inactive routing module TCP task performs
an operation 122 for storing the second copy of the
inbound TCP packet in a ready portion of the receive
queue associated with the inactive routing module TCP
task (i.e., the ready portion of the inactive module receive
queue). In at least one embodiment of the operation
122, the operation 122 for storing the second copy
of the inbound TCP packet in the ready portion of the
inactive routing module receive queue includes forwarding
the second copy of the inbound TCP packet from the
pending portion to the ready portion of the inactive routing
module receive queue. Upon a determination in operation
121 that processing of the BGP message contained
in the first copy of the inbound TCP packet was
performed successfully, the second copy of the inbound
TCP packet can be immediately moved from the pending
portion to the ready portion of the inactive routing
module receive queue, but, in at least one embodiment,
it is advantageous to cause such action to occur at a
later time for performance reasons. For example, an
instruction for the inactive routing module to perform
operation 122 may be included within other information
destined for the inactive routing module to avoid the
need to send the instruction separately and to minimize
the amount of information being sent to the inactive routing
module and the amount of processing required by
the inactive routing module.

[0023] After, and only after, the second copy of the
inbound TCP packet is stored in the ready portion of the
inactive routing module receive queue, the inactive routing
module BGP task performs an operation 124 for
processing the BGP message contained in the second
copy of the inbound TCP packet. The active routing
module performs an operation 126 for issuing an
acknowledgement message for designating that the
inbound TCP packet has been received. In at least one
embodiment, operation 126 is performed after operation
122, while in at least one other embodiment, operation
126 is performed before operation 122, as long as it is
performed after operation 116. The operation for initially
storing the second copy of the inbound TCP packet in
the pending portion of the inactive routing module receive
queue and then in the ready portion of the inactive
routing module receive queue enables the second copy
of the inbound TCP packet to remain unprocessed by
the inactive routing module BGP task until the contents
of the inbound TCP packet are determined to be
non-offensive (e.g., not causing a BGP task failure) via the
active routing module BGP task successfully processing
the first copy of the inbound TCP packet. In this manner,
the inactive routing module BGP task processes a particular
BGP message after the active routing module
BGP task processes the particular BGP message.
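
The pending and ready portions described above can be sketched as a two-stage queue. All names below are hypothetical, and the placement of the acknowledgement directly after operation 116 is one ordering the text permits, not the only one.

```python
from collections import deque


class StandbyReceiveQueue:
    """Receive queue of the inactive routing module, split into a
    pending portion and a ready portion (illustrative sketch)."""

    def __init__(self):
        self.pending = deque()
        self.ready = deque()

    def store(self, segment):
        self.pending.append(segment)               # operation 116

    def promote(self):
        self.ready.append(self.pending.popleft())  # operation 122

    def next_for_bgp(self):
        # Operation 124 may only consume from the ready portion.
        return self.ready.popleft() if self.ready else None


def handle_inbound(segment, active_process, queue, send_ack):
    queue.store(segment)     # operation 116: copy held as pending
    send_ack(segment)        # operation 126: assumed here to follow 116
    active_process(segment)  # operation 118: raises on an offending message
    queue.promote()          # operation 122: reached only on success (121)


def offending(_segment):
    raise ValueError("offending BGP message")


queue = StandbyReceiveQueue()
acks = []
handle_inbound("good", lambda s: None, queue, acks.append)
first = queue.next_for_bgp()
try:
    handle_inbound("bad", offending, queue, acks.append)
except ValueError:
    pass
second = queue.next_for_bgp()
```

After the failing call, the copy of the offending segment is still held in the pending portion and is never handed to the inactive module's BGP task.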
[0024] In at least one embodiment of the active rout- 
ing module and inactive routing module BGP tasks, 
such BGP tasks are precluded from receiving partial 
TCP packets from the TCP task. Such partial packets 
may contain partial BGP messages therein, potentially 
causing synchronization problems when an activity 
switch is implemented. It is contemplated herein that a 
TCP task may be configured such that it precludes an
associated BGP task from recognizing that information of a
packet is being received until the information comprises
a full packet. A socket option exists for enabling such 
functionality. 
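
The text relies on a socket option to hide partial packets from the BGP task. As an alternative illustration only, and not the mechanism the text refers to, the same effect can be sketched at the application level by buffering the TCP byte stream and releasing only complete BGP messages. The sketch assumes the standard BGP message header of RFC 4271 (16-byte marker, 2-byte length, 1-byte type); the class and helper names are hypothetical.

```python
import struct
from typing import List

BGP_HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type


class MessageFramer:
    """Accumulates stream data and yields only whole BGP messages."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        self._buffer.extend(data)
        complete = []
        while len(self._buffer) >= BGP_HEADER_LEN:
            # The length field covers the whole message, header included.
            (length,) = struct.unpack_from("!H", self._buffer, 16)
            if length < BGP_HEADER_LEN:
                raise ValueError("invalid BGP length field")
            if len(self._buffer) < length:
                break  # partial message: keep it hidden from the BGP task
            complete.append(bytes(self._buffer[:length]))
            del self._buffer[:length]
        return complete


def make_message(msg_type: int, body: bytes) -> bytes:
    """Build a BGP-framed message for the example (hypothetical helper)."""
    header = b"\xff" * 16 + struct.pack("!HB", BGP_HEADER_LEN + len(body), msg_type)
    return header + body


keepalive = make_message(4, b"")
update = make_message(2, b"x" * 10)
stream = keepalive + update

framer = MessageFramer()
out1 = framer.feed(stream[:25])    # whole KEEPALIVE plus part of the UPDATE
out2 = framer.feed(stream[25:40])  # still a partial UPDATE
out3 = framer.feed(stream[40:])    # UPDATE now complete
```

Only the first and third calls return a message; the partial UPDATE is never exposed, which is the property the paragraph above asks of the TCP task.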

[0025] By enabling the BGP message contained in 
the second copy of the inbound TCP packet to be processed
only after the second copy of the TCP packet is
stored in the ready portion of the inactive routing module
receive queue, it is assured that such processing of the
BGP message contained in the second copy of the TCP
packet will take place only after the BGP message
contained in the first copy of the TCP packet is successfully
processed by the active routing module BGP task. It 
should be understood that the BGP messages con- 
tained in the first copy and the second copy of the in- 
bound TCP packet are essentially identical (i.e., the 
same BGP message). Accordingly, the potential for fail- 
ure of the active and inactive routing module BGP tasks 
resulting from the same BGP message is substantially 
reduced, if not eliminated. 

[0026] Issuing the acknowledgement message for
designating that the inbound TCP packet has been received
(e.g., operation 126) only after the second copy
of the TCP packet has been stored in the inactive routing
module receive queue pending portion (e.g., operation
116) assures that the inactive routing module will not fail
to receive any TCP packets, even in the event of an
activity switch. Thus, in the event that an activity switch
does not occur, the BGP message contained in the first
copy of the TCP packet is successfully processed by the
active routing module BGP task. Moreover, upon an
activity switch, the TCP communication with the peer
network element that transmitted a TCP packet including
an offending BGP message is terminated. This operational
sequence ensures that a TCP packet including an
offending BGP message is not processed after such
message results in failure of the active routing module
BGP task. Thus, redundancy robustness is enhanced.
[0027] Turning now to update messages being transmitted
from the network element for reception by its peer
network elements, it will be appreciated that redundancy
can also be provided where such outbound update
messages are concerned. For example, in response to
receiving a BGP message designating a new route, a
route update message is transmitted from the network
element for reception by one or more of its peer network
elements for notifying such peer network elements of
the new route.

[0028] Accordingly, in response to the active routing 
module receiving such types of BGP messages that necessitate
an outbound update message or receiving a
route from another protocol (e.g., OSPF or ISIS), or in
response to an internal event (e.g., a configuration
change, such as, for example, adding a static route) for
which an update message should be generated, an operation
128, FIG. 1, is performed for storing a first copy
of an outbound BGP packet encapsulated within one or 
more TCP packets in a transmit queue of the active rout- 
ing module (i.e., the active routing module transmit 
queue). Operation 128 occurs after operation 118. Also, 
in response to the active routing module receiving such 
types of BGP messages that involve an outbound up- 
date message, an operation 130 is performed for storing 
a second copy of the outbound BGP packet encapsu- 
lated within one or more TCP packets in a transmit 
queue of the inactive routing module (i.e., the inactive
routing module transmit queue). Similar to receive 
queue functionality as disclosed herein, in at least one 
embodiment of the active routing module and inactive 
routing module transmit queues, such transmit queues 
are precluded from storing partial BGP packets. An operation
132 is performed for forwarding the first copy of
the outbound BGP packet encapsulated within one or 
more TCP packets from the active routing module trans- 
mit queue for reception by one or more peer network 
elements only after the second copy of the outbound 
BGP packet encapsulated within one or more TCP 
packets is stored in the inactive routing module transmit 
queue. In this manner, retransmission and packet se- 
quencing functionality are maintained after an activity 
switch from the active routing module to the inactive 
routing module. 
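
The ordering constraint on outbound updates (operations 128, 130 and 132) can be sketched as follows. The class and method names are hypothetical; the only property taken from the text is that a message is sent to peers only after its second copy sits in the inactive routing module transmit queue.

```python
from typing import Callable, List


class TransmitPath:
    """Outbound path in which a message is mirrored to the inactive
    module's transmit queue before it is sent (illustrative sketch)."""

    def __init__(self, send_to_peer: Callable[[bytes], None]) -> None:
        self.active_queue: List[bytes] = []
        self.inactive_queue: List[bytes] = []
        self._send_to_peer = send_to_peer

    def send_update(self, message: bytes) -> None:
        if not message:
            # Transmit queues are precluded from storing partial packets;
            # an empty message stands in for an incomplete one here.
            raise ValueError("refusing to queue an incomplete message")
        self.active_queue.append(message)    # operation 128
        self.inactive_queue.append(message)  # operation 130
        self._send_to_peer(message)          # operation 132, after 130


observed = []
path = TransmitPath(
    # Record what the inactive queue held at the moment of sending.
    lambda m: observed.append((m, list(path.inactive_queue)))
)
path.send_update(b"route-update")
```

At the moment the message leaves the network element, the inactive module already holds its copy, so retransmission and sequencing state would survive an activity switch.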

[0029] Referring back to the operation 121, this operation
is also capable of determining whether processing
of the BGP message contained in the first copy of the 
inbound TCP packet is not successfully performed. In 
response to the BGP message contained in the first 
copy of the inbound TCP packet not being processed
successfully by the active routing module BGP
task, an activity switch is facilitated, and the process is 
directed to an entry point "A." The activity switch trans- 
fers on-line operations of the network element from the 
previously active routing module (now the inactive rout- 
ing module) to a newly active routing module (previously 
the inactive routing module). 
[0030] FIG. 2 depicts a method 200 for facilitating an
activity switch in accordance with an embodiment of the 
disclosures made herein. The method 200 pertains to 
an activity switch resulting from an offending inbound 
TCP packet. A TCP packet including an offending BGP 
message is one example of the offending TCP packet. 
[0031] At an entry point "A" corresponding to processing
of an errant BGP packet of an inbound TCP packet
(i.e., one for which satisfactory error handling has not
otherwise been provided), the method 200 begins with
an error handling routine, invoked by a system controller,
implementing an operation 202 for identifying a peer network
element from which the offending inbound BGP
packet was received. One embodiment of identifying the
peer network element includes reading/accessing
a record generated in response to the operation
120, FIG. 1, for facilitating recordation of the peer network
element from which the offending inbound TCP packet was
received. It should be noted that, in at least one embod- 
iment, when recording the network element from which 
a packet has been received, the peer network element 
of the BGP packet is recorded, not that of the TCP seg- 
ment. It is possible that the BGP peer network element 
and TCP peer network element may be different. BGP 
may have a session with a neighbor that requires multiple
TCP hops to reach. As such, the identified peer network
element is identified with respect to the higher layer
protocol packet (e.g., the BGP packet). 
[0032] The system controller may, for example, be a 
control element, such as a processor, coupled to the net- 
work element or incorporated within the network ele- 
ment. For example, the system controller may be imple- 
mented as a process that reads the record to determine 
the peer network element from which the packet was 
received and to initiate the termination of the associated 
BGP session on the inactive routing module. In at least 
one embodiment, this process is contained within the 
active routing module. When required to terminate a 
session, the active routing module (e.g., the system con- 
troller contained within the active routing module) com- 
municates with the inactive routing module as to which 
peering session to terminate. 
[0033] After identifying the peer network element from 
which the offending inbound BGP packet was received 
(i.e., the identified peer network element), the error handling
routine performs an operation 204 for initiating termination
of a BGP session associated with the identified
peer network element and an operation 206 for initiating
termination of a TCP session associated with the identified
peer network element. Since, in at least one embodiment,
initiating termination of a BGP session will inherently
initiate termination of a TCP session, operations
204 and 206 may optionally be performed as a single
operation. Likewise, such a single operation may result
in performance of operation 208, which may inherently
result in performance of operation 210. After initiating
termination of the BGP and TCP sessions associated
with the identified peer network element, the newly
active routing module TCP task performs an operation
210 for terminating the TCP session associated with the
identified peer network element, and the soon-to-be-newly
active routing module BGP task performs an operation
208 for terminating the BGP session associated with the
identified peer network element. In response to facilitating
termination of the TCP session associated with the
identified peer network element, the newly active routing
module TCP task performs an operation 212 for purging
the offending inbound TCP packet from the receive
queue of the newly active routing module. The actual
switching of functional operations is facilitated after the
TCP and BGP sessions are terminated and the offending
inbound TCP packet is purged from the receive
queue of the newly active routing module. Because the
TCP session has been terminated, even if the offending
inbound TCP packet has not been acknowledged, it will
not be re-sent, thereby avoiding a failure of the newly
active routing module.
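
The sequence of operations 202 through 212 may be sketched as
follows. This is an illustrative sketch only: the RoutingModule
data structure, the record dictionary and its "bgp_peer" key are
assumptions introduced for the example and do not appear in the
disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingModule:
    """Hypothetical per-module state for the module about to take over."""
    bgp_sessions: set = field(default_factory=set)
    tcp_sessions: set = field(default_factory=set)
    receive_queue: list = field(default_factory=list)  # (peer, packet) pairs


def handle_offending_packet(record, standby, offending):
    """Prepare the standby module for takeover; return True when ready."""
    peer = record["bgp_peer"]               # operation 202
    standby.bgp_sessions.discard(peer)      # operations 204 and 208
    standby.tcp_sessions.discard(peer)      # operations 206 and 210
    # Operation 212: purge the offending packet from the receive queue.
    standby.receive_queue = [
        entry for entry in standby.receive_queue
        if entry != (peer, offending)
    ]
    return (peer not in standby.bgp_sessions
            and peer not in standby.tcp_sessions)


standby = RoutingModule(
    {"peerA", "peerB"},
    {"peerA", "peerB"},
    [("peerA", b"bad"), ("peerB", b"ok")],
)
ready = handle_offending_packet({"bgp_peer": "peerA"}, standby, b"bad")
```

In this sketch the sessions with other peers are left untouched,
and the activity switch would proceed only once the function
reports that the sessions are down and the packet is purged.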

[0034] As an additional precaution, TCP and BGP
task sessions with the identified peer network element
are not re-established until after an operation 214 is performed
for restarting the newly inactive routing module and until after
an operation 216 is performed for synchronizing existing
routing-related information of the newly inactive routing
module with the newly active routing module. Such
routing-related information may include information stored
in a routing information database, as well as other information,
such as configuration information (e.g., static
configuration information) and state information (e.g.,
dynamic state information). In response to synchronizing
such existing routing-related information, the newly
active routing module BGP task performs an operation
220 for re-establishing a BGP session with the identified
peer network element. To re-establish a BGP session in
accordance with operation 220, the newly active routing
module TCP task performs an operation 218 for re-establishing
a TCP session with the identified peer network
element. In this manner, risks associated with
re-establishing such task sessions with the identified peer
network element without a redundant routing module being
in place are reduced, if not eliminated. Optionally, in at
least one embodiment, BGP and TCP task sessions are
maintained with other peer network elements besides
the identified peer network element.
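
The ordering of operations 214 through 220 can be expressed as a
small guard structure. The Recovery class below is a hypothetical
illustration, assuming each step simply refuses to run until its
prerequisite has completed; it is not the disclosed implementation.

```python
class Recovery:
    """Hypothetical tracker enforcing the order of operations 214-220."""

    def __init__(self):
        self.standby_restarted = False
        self.standby_synchronized = False
        self.tcp_up = False
        self.bgp_up = False

    def restart_standby(self):        # operation 214
        self.standby_restarted = True

    def synchronize_standby(self):    # operation 216
        if not self.standby_restarted:
            raise RuntimeError("standby must be restarted first")
        self.standby_synchronized = True

    def reestablish_tcp(self):        # operation 218
        if not self.standby_synchronized:
            raise RuntimeError("no redundant module in place yet")
        self.tcp_up = True

    def reestablish_bgp(self):        # operation 220
        if not self.tcp_up:
            raise RuntimeError("BGP runs over TCP; bring TCP up first")
        self.bgp_up = True
```

Attempting to re-establish the TCP session before the newly
inactive routing module is synchronized raises an error in this
sketch, which mirrors the precaution described above.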

[0035] FIG. 3 depicts a method 300 for synchronizing 
routing protocol information associated with a plurality
of routing modules of a network element in accordance
with an embodiment of the disclosures herein. By synchronizing
such routing protocol information, redundancy
in accordance with the disclosures made herein may
be implemented. Such synchronization contributes to
enabling an activity switch from a first routing module of
the network element to a second routing module of the
network element in an essentially transparent manner
with respect to peer network elements. 
[0036] The method 300 begins with an inactive rout- 
ing module performing an operation 302 for receiving a 
copy of existing routing protocol information from an active
routing module. The operation 302 is performed in
response to the inactive routing module being an additional
routing module that is added to a network element
including the active routing module. Because the active
routing module is an existing, in-use routing module of
the network element, the active routing module has such
existing routing protocol information associated therewith
prior to addition of the inactive routing module. For
example, it may be the case that at least a portion of the 
routing protocol information was dynamically estab- 
lished in the existing routing module over a period of 
time prior to the addition of the additional routing module 
to the network element. Examples of routing protocol information
include TCP-related state information, BGP
configuration, BGP routing tables, and route state information
(e.g., designation that a route has been advertised
to peer network elements).
[0037] In response to the inactive routing module re- 
ceiving such existing routing information, an operation 
304 is performed for updating inactive routing module 
records associated with such routing protocol informa- 
tion. An embodiment for updating such inactive routing 
module records associated with such existing routing 
protocol information includes updating a routing infor- 
mation database of the inactive routing module. In one 
embodiment of the inactive routing module, the inactive 
routing module does not include any existing routing 
protocol information (e.g., the inactive routing module is
a new routing module being put into service). In another 
embodiment of the inactive routing module, the inactive 
routing module includes existing routing protocol infor- 
mation that is being updated. 

[0038] At some point in time after the inactive routing
module is added to the network element and during the
normal course of operation of the active routing module,
the active routing module performs an operation 306 for
receiving a first copy of new routing protocol information
(newly-received routing protocol information) from one
or more peer network elements. In response to receiving
such newly-received routing protocol information, the
active routing module performs an operation 308 for updating
active routing module records associated with
such newly-received routing protocol information, an
operation 312 for forwarding a second copy of such
newly-received routing protocol information for reception
by the inactive routing module, and an operation
310 for acknowledging receipt of such newly-received
routing protocol information. Thus, acknowledgement is
provided to the one or more peer network elements from
which the new routing protocol information was received
after a copy of such new routing protocol information (or
the portion thereof for which the acknowledgement is
being provided) has been forwarded to the inactive routing
module (i.e., the additional routing module). After the
active routing module forwards such newly-received
routing protocol information for reception by the inactive
routing module, the inactive routing module performs
an operation 314 for receiving such newly-received routing
protocol information and an operation 316 for updating
inactive routing module records associated with
such routing protocol information. It should be noted that
operations 304 and 316 may be performed as separate
operations or combined into a single operation. A TCP
packet including a BGP message is an example of such
newly-received routing protocol information during the
normal course of operation of the active routing module.
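
The forward-before-acknowledge ordering of operations 306 through
316 may be sketched as shown below. The Module class and the
receive_from_peer function are hypothetical names chosen for this
example, and the routing records are reduced to a simple mapping
of prefix to next hop for illustration.

```python
class Module:
    """Hypothetical routing module with a minimal route table."""

    def __init__(self):
        self.routes = {}

    def update(self, prefix, next_hop):
        self.routes[prefix] = next_hop


def receive_from_peer(active, inactive, prefix, next_hop):
    """Return True (acknowledge) only after the standby holds the update."""
    active.update(prefix, next_hop)      # operation 308
    inactive.update(prefix, next_hop)    # operations 312, 314 and 316
    # Operation 310: acknowledge to the peer last.
    return inactive.routes.get(prefix) == next_hop


active, inactive = Module(), Module()
acked = receive_from_peer(active, inactive, "10.0.0.0/8", "192.0.2.1")
```

Since the acknowledgement is withheld until the second copy is
forwarded, a peer would retransmit any data that the inactive
routing module had not yet received at the time of a failure.
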
[0039] Referring now to FIG. 4, a network element
400 capable of carrying out methods in accordance with
embodiments of the disclosures made herein is depicted.
Specifically, the network element 400 is capable of
carrying out redundancy and synchronization functionality
in accordance with the disclosures made herein.
For example, the network element 400 is capable of carrying
out the methods disclosed herein (e.g., the methods
100, 200 and 300). An apparatus capable of providing
routing functionality (e.g., a router) is an example of
the network element 400.

[0040] The network element 400 includes an active
routing module 402 (i.e., the first routing module), an
inactive routing module 404 (i.e., the second routing
module), and a line card 405 connected between the
active and inactive routing modules (402, 404). The line
card facilitates routing a respective copy of each inbound
TCP packet (e.g., via forwarding of corresponding
Protocol Data Units (PDUs)). However, the TCP task
of the inactive routing module 404 ignores such TCP
packets (e.g., does not process the PDUs) while the
TCP task of the active routing module 402 processes
such TCP packets.

[0041] The active routing module 402 and the inactive
routing module 404 are capable of facilitating redundant
functionality in accordance with the disclosures made
herein. The active routing module 402 and the inactive
routing module 404 each include respective TCP tasks
(406, 408), respective BGP tasks (410, 412) and respective
routing information databases (414, 416). The
TCP tasks (406, 408) are each examples of lower layer
protocol tasks. The BGP tasks (410, 412) are each examples
of higher layer protocol tasks. It is contemplated
herein that BGP tasks (410, 412) may be substituted
with other protocols that use TCP to exchange messages
(e.g., multi-protocol label switching (MPLS)).

[0042] The TCP task 406 of the active routing module
includes a receive queue 418 and a transmit queue 420.
The TCP task 408 of the inactive routing module 404
includes a receive queue 422 and a transmit queue 424.
The receive queue 422 includes a pending portion 426
and a ready portion 428. The pending portion 426 and
the ready portion 428 of the inactive routing module receive
queue 422 facilitate functionality as depicted in
FIG. 1. Specifically, the pending portion 426 and the
ready portion 428 of the inactive routing module receive
queue 422 enable a particular copy of a TCP packet to
remain unprocessed by the inactive routing module
BGP task 412 until the contents of such TCP packet are
determined to be non-offensive (e.g., not causing a BGP
task failure) by the BGP task 410 of the active routing
module 402. Thus, the inactive routing module BGP task
412 processes a particular BGP message after the active
routing module BGP task 410 processes the particular
BGP message.
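
The role of the pending portion 426 and the ready portion 428 can
be illustrated with a minimal sketch. The InactiveReceiveQueue
class and its method names are assumptions made for this example;
the sketch assumes packets are released strictly in arrival order
once the active BGP task has processed them.

```python
from collections import deque


class InactiveReceiveQueue:
    """Hypothetical two-part receive queue of the inactive module."""

    def __init__(self):
        self.pending = deque()   # pending portion 426
        self.ready = deque()     # ready portion 428

    def enqueue(self, packet):
        """Every inbound copy starts in the pending portion."""
        self.pending.append(packet)

    def mark_processed(self, packet):
        """Called once the active BGP task processed the packet safely."""
        if self.pending and self.pending[0] == packet:
            self.ready.append(self.pending.popleft())
            return True
        return False

    def purge(self, packet):
        """Drop an offending packet so the standby never processes it."""
        if packet in self.pending:
            self.pending.remove(packet)

    def next_for_bgp(self):
        """Hand the inactive BGP task only vetted packets."""
        return self.ready.popleft() if self.ready else None


queue = InactiveReceiveQueue()
queue.enqueue(b"good")
queue.enqueue(b"bad")
queue.mark_processed(b"good")
queue.purge(b"bad")
```

With this arrangement, a message that would crash the active BGP
task never reaches the inactive BGP task, because it is purged
while still in the pending portion.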

[0043] In some embodiments of the inactive routing
module, the inactive routing module 404 does not receive
flow control updates from the active routing module
402. Thus, it is theoretically possible for the inactive
routing module receive queue 422 to overflow. To reduce
this possibility, the inactive routing module receive
queue 422 is preferably much larger than the active routing
module receive queue 418 in such embodiments.
However, it should be understood that the inactive routing
module 404 does less work than the active routing
module 402 (e.g., the flooding responsibilities are greatly
reduced, the TCP/IP stack is not transmitting data,
etc.). Accordingly, there should not be a steady state
possibility where the inactive routing module receive
queue 422 continues to grow without limit.
[0044] It should be understood that the active routing 
module 402 is capable of supporting functionality disclosed
herein in association with the inactive routing
module 404 and the inactive routing module 404 is ca- 
pable of supporting functionality disclosed herein in as- 
sociation with the active routing module 402. According- 
ly, in the event of an activity switch in accordance with 
the disclosures made herein, the active routing module 
402 (i.e., the newly inactive routing module) provides
functionality previously provided by the inactive routing 
module 404 (i.e., the newly active routing module) and 
the inactive routing module 404 provides functionality 
previously provided by the active routing module 402. 
For example, after an activity switch, the active routing 
module 402 provides functionality associated with the 
pending portion 426 and ready portion 428 of the inactive
routing module 404. 

[0045] In accordance with at least one embodiment of 
the disclosures made herein, the BGP tasks of the active 
routing module 402 and the inactive routing module 404 
do not queue any transmit data on a per-peer (i.e., per 
socket) basis. One reason that the BGP tasks no longer 
queue on a per-peer basis is because data queued in 
the BGP task would not be guaranteed of delivery after 
an activity switch. Another reason is that synchronization
of lists of routes which need to be advertised or withdrawn
would be excessively intensive if BGP task transmit
queues needed searching.
[0046] It is contemplated herein that the active routing
module transmit queue 420 is enlarged in order to enable
omission of a transmit queue of the active routing
module BGP task. That is, the transmit queue 420 of the
active routing module TCP task 406 needs to be large 
enough to ensure that transmissions continue between 
successive periods of processing of advertised or with- 
drawn routes. 

[0047] Because the BGP tasks (410, 412) of the active
and inactive routing modules (402, 404) cannot
queue any transmit data, an operation for transmitting
data to the active routing module TCP task 406 must
succeed. Otherwise, the active routing module BGP
task 410 would have to queue such transmit data, which
it preferably does not do. To ensure that the operation
for transmitting data to the active routing module TCP
task 406 succeeds, the active routing module BGP task
410 first ensures that sufficient space exists in the transmit
queue 420 associated with the active routing module
TCP task 406. In one embodiment, ensuring that such
sufficient space exists is accomplished via a read in
shared memory. To this end, the TCP task 406 of the
active routing module 402 maintains a table of free
space in the active routing module transmit queue 420.
However, in other embodiments, other techniques may
be used for ensuring that such sufficient space exists.
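
The check-before-send behavior may be sketched as follows. The
TcpTransmitQueue class and the bgp_send function are hypothetical;
in particular, the free_space method is only a stand-in for the
shared-memory read of the free-space table, whose layout is not
specified in the disclosure.

```python
class TcpTransmitQueue:
    """Hypothetical byte-bounded transmit queue of the TCP task."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.data = []

    def free_space(self):
        # Stand-in for reading the free-space table in shared memory.
        return self.capacity - self.used

    def enqueue(self, packet):
        if len(packet) > self.free_space():
            raise BufferError("transmit queue full")
        self.data.append(packet)
        self.used += len(packet)


def bgp_send(queue, packet):
    """Check space first so the hand-off to the TCP task cannot fail."""
    if queue.free_space() < len(packet):
        return False   # BGP keeps no queue; the update is deferred
    queue.enqueue(packet)
    return True


queue = TcpTransmitQueue(capacity=10)
first = bgp_send(queue, b"12345678")   # fits in the queue
second = bgp_send(queue, b"1234")      # would overflow, so deferred
```

Because the BGP task checks the available space before handing
data over, the enqueue step in this sketch never raises, and the
BGP task itself never needs a per-peer transmit queue.
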
[0048] Referring now to data processor programs in
accordance with an embodiment of the disclosures
made herein, a data processor program controls at least
a portion of the operations associated with synchronizing
higher layer protocol tasks (e.g., BGP) and lower layer
protocol tasks (e.g., TCP) running on redundant routing
modules of a network element. In this manner, the
data processor program controls at least a portion of the
operations necessary to facilitate routing module synchronization
functionality in a manner consistent with
the disclosures made herein. The term data processor
program is defined herein to refer to computer software,
data processor algorithms or any other type of instruction
code capable of controlling operations associated
with a data processor. A microprocessor, microcontroller,
microcomputer, digital signal processor, state machine,
logic circuitry, and/or any device that manipulates
digital information based on operational instruction, or
in a predefined manner are examples of a data processor.

[0049] A data processor program in accordance with
an embodiment of the disclosures made herein is
processible by a data processor of an active and/or inactive
routing module of a network element. A copy of
the data processor program may be resident on each of
the routing modules in a network element. Furthermore,
each copy of the data processor program may be accessible
by a data processor of the respective routing
module from a memory apparatus of the respective routing
module (e.g., RAM, ROM, virtual memory, hard drive
memory, etc.) or from a peripheral apparatus such as a
diskette, a compact disk, an external data storage device
and the like.

[0050] A data processor program accessible from an
apparatus by a data processor is defined herein as a
data processor program product. It is contemplated
herein that the data processor program product may
comprise more than one data processor program, each
accessible from a respective apparatus. It is further
contemplated herein that each one of a plurality of data
processor programs may be accessed by a different respective
one of a plurality of data processors. For example,
a first data processor and a second data processor
(e.g., of a leaf node and a root node), respectively,
may access a first data processor program and a second
data processor program, respectively, from a first
apparatus and a second apparatus (e.g., a first memory
device and a second memory device), respectively.
[0051] In the preceding detailed description, refer- 
ence has been made to the accompanying drawings 
that form a part hereof, and in which are shown by way 
of illustration specific embodiments in which the inven- 
tion may be practiced. These embodiments have been 
described in sufficient detail to enable those skilled in 
the art to practice the invention, and it is to be under- 
stood that other embodiments may be utilized and that 
logical, mechanical, chemical and electrical changes 
may be made without departing from the spirit or scope 
of the invention. To avoid detail not necessary to enable 
those skilled in the art to practice the invention, the de- 
scription omits certain information known to those of skill 
in the art. The preceding detailed description is, there- 
fore, not to be taken in a limiting sense, and the scope 
of the present invention is defined only by the appended
claims. 



Claims 

1. A method for facilitating routing protocol redundancy
in a network element, comprising:

adding an additional routing module to the net- 
work element, wherein the network element in- 
cludes an existing routing module having an ex- 
isting collection of routing protocol information 
associated therewith; and 
synchronizing a higher layer protocol task of the 
additional routing module with a higher layer 
protocol task of the existing routing module in 
response to adding the additional routing mod- 
ule. 

2. The method of claim 1 wherein synchronizing the 
higher layer protocol task of the additional routing 
module with the higher layer protocol task of the ex- 
isting routing module includes: 

imparting the existing collection of routing protocol
information upon the additional routing
module in response to adding the additional
routing module to the network element;
updating the existing routing module with new 
routing protocol information; and 
updating the additional routing module with the 
new routing protocol information. 

3. The method of claim 2 wherein:

updating the additional routing module with the
new routing protocol information is performed
after a higher layer protocol task of the existing
routing module processes the new routing protocol
information; and
the new routing protocol information is a higher
layer protocol packet.

4. The method of claim 3 wherein:

the higher layer protocol packet specifies a
route to advertise; and
processing the higher layer protocol packet includes
advertising the route.

5. The method of claim 2, further comprising:

forwarding at least a portion of the new routing
protocol information to the additional routing
module; and
acknowledging receipt of the at least the portion
of the new routing protocol information,
wherein the step of forwarding the at least the
portion of the new routing protocol information
to the additional routing module occurs prior to
the step of acknowledging receipt of the at least
the portion of the new routing protocol information.

6. The method of claim 2 wherein the new routing protocol
information includes at least one of a BGP
message, TCP task-related state information, BGP
routing tables and route state information.

7. The method of claim 1 wherein the existing collection
of routing protocol information includes at least
one of a BGP message, TCP task-related state information,
BGP routing tables and route state information.

8. The method of claim 1 wherein adding the additional
routing module includes adding the additional
routing module after the existing routing module has
been in operation for a period of time and wherein
at least a portion of the routing protocol information
was dynamically established over the period of
time.

9. Apparatus for facilitating routing protocol redundancy
in a network element, comprising:

an existing routing module having an existing
collection of routing protocol information associated
therewith; and
an additional routing module coupled to the existing
routing module, wherein a higher layer
protocol task of the additional routing module
is synchronized with a higher layer protocol
task of the existing routing module in response
to coupling the additional routing module to the
existing routing module.



10. The apparatus of claim 9 wherein synchronization
of the higher layer protocol task of the additional
routing module with the higher layer protocol task
of the existing routing module includes imparting
the existing collection of routing protocol information
upon the additional routing module in response
to adding the additional routing module to the network
element, updating the existing routing module
with the new routing protocol information, and updating
the additional routing module with the new
routing protocol information.



11. The apparatus of claim 10 wherein updating of the
additional routing module with the new routing protocol
information is performed after a higher layer
protocol task of the existing routing module processes
the new routing protocol information and
wherein the new routing protocol information is a
higher layer protocol packet.



12. The apparatus of claim 11 wherein the higher layer
protocol packet specifies a route to advertise and
processing the higher layer protocol packet includes
advertising the route.

13. The apparatus of claim 10 wherein the existing routing
module forwards at least a portion of the new
routing protocol information to the additional routing
module prior to acknowledging receipt of the at least
the portion of the new routing protocol information.

14. The apparatus of claim 10 wherein the new routing 
protocol information includes at least one of a BGP 
message, TCP task-related state information, BGP 
routing tables and route state information. 


15. The apparatus of claim 9 wherein the existing collection
of routing protocol information includes at
least one of a BGP message, TCP task-related state
information, BGP routing tables and route state information.



16. The apparatus of claim 9 wherein the additional
routing module is added after the existing routing
module has been in operation for a period of time
and wherein at least a portion of the routing protocol
information was dynamically established over the
period of time.